DETAILED ACTION

Response to Amendment

1. Applicant's arguments have been considered but are moot in view of the new
ground(s) of rejection in view of Galanes et al. (US 7260535), necessitated by claim 
amendment and introduction of new claims. 

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 18-21, 23-24, 34, 38-42, 51-55, 57-60, 65-69, and 72-79 are rejected
under 35 U.S.C. 103(a) as being unpatentable over Kriechbaum et al. (US 6975985) in 
view of Galanes et al. (US 7260535). 

4. Regarding claim 18, Kriechbaum et al. disclose a method of testing a speech 
recognizer, the method comprising: 

receiving a selected portion of a digital audio data file (element 300 in figure 4); 

receiving a grammar having a set of responses expected to occur in the selected
portion (SRS or speech recognition system 500 in figure 4 inherently includes a set of
grammars);

based at least in part on the selected portion and the grammar, producing a
decode result of the selected portion (result of the speech recognition system 500 in
figure 4 is the decoded result);

receiving a transcript of the selected portion (true transcript 520 in figure 4); and

scoring the decode result based at least in part on the transcript (Aligner 550 in
figure 4).
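
For illustration only, the scoring step recited above can be sketched as a word-level comparison of the decode result against the transcript. The sketch below is not taken from Kriechbaum et al., Galanes et al., or the application; the function name `word_error_rate` and the choice of Levenshtein edit distance are assumptions made for the example.

```python
def word_error_rate(transcript: str, decode: str) -> float:
    """Word-level edit distance divided by the transcript length.

    Hypothetical helper: counts the substitutions, insertions, and
    deletions needed to turn the decode result into the transcript.
    """
    ref = transcript.lower().split()
    hyp = decode.lower().split()
    if not ref:
        raise ValueError("transcript must contain at least one word")

    # prev[j] holds the distance between the ref words seen so far
    # (one fewer than the current row) and the first j hyp words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

Under these assumptions, a six-word transcript decoded with one wrong word scores 1/6, and an exact match scores 0.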

Kriechbaum et al. fail to specifically disclose each audio file comprising audio 
recorded in response to a first prompt by a speech recognition application; and 
receiving a grammar associated with the first prompt, the grammar comprising a 
plurality of concepts, each concept having a set of phrases organized under a single 
idea, the idea representing an expected response to the first prompt. However, 
Galanes et al. teach each audio file comprising audio recorded in response to a first
prompt by a speech recognition application (col. 17, line 30 to col. 18-67, user is
prompted for speech input); and receiving a grammar associated with the first prompt,
the grammar comprising a plurality of concepts, each concept having a set of phrases
organized under a single idea, the idea representing an expected response to the first
prompt (col. 17, line 30 to col. 18-67, grammars are in association with prompt; so when
a particular prompt is activated, its associated grammars are also activated for use by
the speech recognizer; and the grammars include a plurality of concepts (e.g.,
departure city, date, time, etc.)).
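
For illustration only, the claimed grammar structure (a grammar associated with a prompt, comprising a plurality of concepts, each concept having a set of phrases organized under a single idea) can be sketched as a small data structure. The class names `Concept` and `PromptGrammar`, the `match` method, and the sample prompt and phrases are hypothetical and are not drawn from either reference or from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Concept:
    idea: str                                       # the single idea, e.g. "departure_city"
    phrases: Set[str] = field(default_factory=set)  # phrases organized under that idea


@dataclass
class PromptGrammar:
    prompt: str              # the prompt this grammar is associated with
    concepts: List[Concept]  # plurality of concepts expected in response

    def match(self, utterance: str) -> Optional[str]:
        """Return the idea whose phrase set contains the utterance, if any."""
        text = " ".join(utterance.lower().split())  # normalize case and spacing
        for concept in self.concepts:
            if text in concept.phrases:
                return concept.idea
        return None


# Hypothetical sample data, loosely following the "departure city, date, time" example.
grammar = PromptGrammar(
    prompt="Where and when would you like to travel?",
    concepts=[
        Concept("departure_city", {"from boston", "leaving from boston"}),
        Concept("travel_date", {"tomorrow", "next monday"}),
    ],
)
```

With this sample data, `grammar.match("Leaving from Boston")` returns `"departure_city"`, and an utterance outside every phrase set returns `None`.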

Because Kriechbaum et al. and Galanes et al. are analogous art from the same
field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time of invention to modify Kriechbaum et al. by incorporating the teaching
of Galanes et al. in order to improve speech recognition accuracy.

5. Regarding claims 34 and 57, Kriechbaum et al. disclose a system for testing a 
speech recognizer, the system comprising: 

an audio recorder module for receiving digital audio input (element 300 in fig. 4);

a grammar editor module configured to access and allow modification of a
grammar, the grammar comprising words, phrases, or phonemes expected to appear in
the audio input (SRS or speech recognition system 500 in figure 4 inherently includes a
set of grammars, and the set of grammars may contain words, phrases, or phonemes);

a speech recognition engine configured to output a recognition result based on
the audio input and the accessed grammar (result of the speech recognition system 500
in figure 4 is the decoded result); and

a scoring module configured to score the recognition result based at least in part
on a user-defined transcript of the audio input and the recognition result (Aligner 550 in
figure 4 aligns true transcript 520 with the decoded result).

Kriechbaum et al. fail to specifically disclose each audio file comprising audio 
recorded in response to a first prompt by a speech recognition application; and 
receiving a grammar associated with the first prompt, the grammar comprising a 
plurality of concepts, each concept having a set of phrases organized under a single 
idea, the idea representing an expected response to the first prompt. However, 
Galanes et al. teach each audio file comprising audio recorded in response to a first
prompt by a speech recognition application (col. 17, line 30 to col. 18-67, user is
prompted for speech input); and receiving a grammar associated with the first prompt,
the grammar comprising a plurality of concepts, each concept having a set of phrases
organized under a single idea, the idea representing an expected response to the first
prompt (col. 17, line 30 to col. 18-67, grammars are in association with prompt; so when
a particular prompt is activated, its associated grammars are also activated for use by
the speech recognizer; and the grammars include a plurality of concepts (e.g.,
departure city, date, time, etc.)).

Because Kriechbaum et al. and Galanes et al. are analogous art from the same
field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time of invention to modify Kriechbaum et al. by incorporating the teaching
of Galanes et al. in order to improve speech recognition accuracy.

6. Regarding claims 20-21, 23-24, and 72, Kriechbaum et al. further disclose the
method of claim 18, wherein the set of responses comprises concepts, phrases, words,
and/or phonemes (SRS or speech recognition system 500 in figure 4 inherently includes
a set of grammars, and the set of grammars may contain words, phrases, or phonemes),
wherein the decode result comprises concepts, phrases, words, and/or phonemes
(inherent feature in a speech recognition system), wherein the decode result comprises
a confidence score (inherent in a speech recognition system), creating and/or modifying a
response file associated with the audio data file (col. 3, lines 60-67), and wherein the
response file comprises the audio file, a portion of the grammar associated with the
audio file, the decode result, and/or the transcript (within the scope of the reference),
and transmitting the decoded result to a tuner module for processing (referring to figure
4).



7. Regarding claims 38-42 and 65-69, Kriechbaum et al. further disclose the system
of claims 34 and 58, respectively, wherein the recognition result comprises a confidence
score (inherent in a speech recognition system), wherein the recognition result comprises
a concept, phrase, word, or phoneme, wherein the recognition result comprises an
indication of an acoustic model used by the speech recognizer in decoding the audio
input, wherein the recognition result comprises an acoustic model score (SRS or
speech recognition system 500 in figure 4 inherently includes a set of grammars, and the
set of grammars may contain words, phrases, or phonemes), and further comprising a
response file for logically associating the audio input, the transcript, the recognition
result, and/or an output of the scoring module (referring to figure 4).

8. Regarding claims 51-55, Kriechbaum et al. further disclose the system of claim
34, wherein the speech recognition engine is configured to transmit the recognition
result to a tuner module for processing (referring to figure 4), the tuner module
configured to transmit digital audio input to the audio recorder module and grammar to
the grammar editor module (referring to figure 4, within the scope of the reference),
further comprising a test module configured to initiate a testing cycle by processing and
transmitting digital audio input and grammar to the speech recognition engine (referring
to speech recognition system 500 in figure 4, which inherently includes grammars and
speech models for comparison with the input speech), wherein the speech recognition engine is
configured to transmit the recognition result to a tuner module for processing (referring
to figure 4, within the scope of the reference), the tuner module configured to
transmit digital audio data and grammar to the test module (referring to figure 4, within
the scope of the reference).

9. Regarding claims 58-60, Kriechbaum et al. further disclose the system of claim
57, further comprising a speech recognition engine configured to output a recognition
result to the scoring module based on input received from the test module (referring to
figure 4), wherein the speech recognition engine is configured to transmit the
recognition result to a tuner module for processing (referring to figure 4), further
comprising a tuner module configured to transmit digital audio data and grammar to the
test module (referring to figure 4, within the scope of the reference).

10. Regarding claim 73, Kriechbaum et al. further disclose the method of claim 18,
further comprising: producing a second decode result of each digital audio file based at
least in part on the modified grammar (the same as the first decode result in claim 1 if
the user uses the system a second time); and scoring the second decode results
based at least in part on the transcript of each audio file (the same as the first decode
result in claim 1 if the user uses the system a second time). Kriechbaum et al. fail to
specifically disclose modifying the grammar. However, Galanes et al. further teach the
step of modifying the grammar (col. 111, lines 60-67).

Because Kriechbaum et al. and Galanes et al. are analogous art from the same
field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time of invention to modify Kriechbaum et al. by incorporating the teaching
of Galanes et al. in order to improve speech recognition accuracy.

11. Regarding claims 74-75, Kriechbaum et al. further disclose the method of claim
73, further comprising comparing the scoring of the first decode results and the scoring
of the second decode results (the same as the first decode result in claim 1 if the user
uses the system a second time), wherein each of the set of phrases comprises a
word, a word block, a BNF construct, or a phoneme block (within the scope of the
reference).

12. Regarding claim 76, Kriechbaum et al. further disclose the method of claim 18,
further comprising: receiving a second plurality of digital audio data files, each audio file
comprising audio recorded in response to a second prompt by the speech recognition
application (the same as the first decode result in claim 1 if the user uses the system
a second time); receiving a second grammar associated with the second prompt,
wherein the second grammar comprises a plurality of concepts, each concept having a
set of phrases organized under a single idea, the idea representing an expected
response to the second prompt; producing a second decode result for each audio file in
the second plurality of digital audio data files based at least in part on the second
grammar; receiving a transcript of each audio file in the second plurality of audio data
files; and scoring the second decode results based at least in part on the transcripts of
each of the second plurality of digital audio files (the same as the first decode result in
claim 1 if the user uses the system a second time).

Application/Control Number: 10/725,281
Art Unit: 2626

13. Regarding claims 77-79, Kriechbaum et al. further disclose the method of claim
18, wherein scoring the decode results comprises generating statistics on the accuracy
of the decode results with respect to each transcript, the statistics comprising word error
rate, concept error rate, and average confidence scores for correct and incorrect results,
wherein the system is configured to, iteratively, modify the grammar based on a
previous scoring of recognition results using the grammar editor module, output a
recognition result for each audio data file based on the modified grammar using the
speech recognition engine, and use the user-defined transcript of each audio data file to
score the modified grammar recognition results using the scoring module, wherein the
grammar editor module is further configured to modify the grammar based on the
scoring of the recognition results, the test module is further configured to transmit the
plurality of audio data files and the modified grammar to a speech recognition engine,
and the scoring module is further configured to receive a recognition result based on the
modified grammar from the speech recognition engine for each of the plurality of audio
data files and to score the recognition results based at least in part on the user-defined
transcript (within the scope of the reference).
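
For illustration only, two of the statistics recited in these claims (concept error rate, and average confidence scores for correct and incorrect results) can be sketched as follows. The record type `ScoredResult`, the function `summarize`, and the definition of concept error rate as the fraction of results whose decoded concept differs from the transcript's concept are assumptions made for the example; they are not taken from either reference or from the application.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class ScoredResult:
    expected_concept: str           # concept implied by the transcript
    decoded_concept: Optional[str]  # concept returned by the recognizer, None if no match
    confidence: float               # recognizer confidence for this result


def summarize(results: List[ScoredResult]) -> Dict[str, Optional[float]]:
    """Aggregate per-file results into accuracy statistics."""
    if not results:
        raise ValueError("at least one result is required")
    correct = [r.confidence for r in results
               if r.decoded_concept == r.expected_concept]
    incorrect = [r.confidence for r in results
                 if r.decoded_concept != r.expected_concept]
    return {
        # Assumed definition: share of results with the wrong concept.
        "concept_error_rate": len(incorrect) / len(results),
        "avg_confidence_correct": mean(correct) if correct else None,
        "avg_confidence_incorrect": mean(incorrect) if incorrect else None,
    }
```

Comparing these figures before and after a grammar change is one way the iterative loop described in the claims could be evaluated; a large gap between the two average confidences would suggest that a confidence threshold can separate correct from incorrect results.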

14. Claims 22, 35-37, 56, 61-64, and 70-71 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Kriechbaum et al. (US 6975985) in view of Galanes et al. (US
7260535), and further in view of Official Notice.

15. Regarding claims 22, 35-37, 56, 61-64, and 70-71, Kriechbaum et al. fail to
specifically disclose a user interface, wherein the user interface comprises a graphical
user interface, wherein the graphical user interface is configured to display an output
from a scoring module configured to score the recognition result based at least in part
on a user-defined transcript of the audio input and the recognition result, and wherein
the graphical user interface is configured to display the digital audio input and the
accessed grammar. However, the examiner takes Official Notice that such a user interface
is well known in the art, particularly in computer systems on which speech recognition is
performed. The method of displaying a recognized result is also well known. One
particular advantage of displaying the recognized result is that the user can proofread
the transcribed text.

Conclusion 

Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within
TWO MONTHS of the mailing date of this final action and the advisory action is not
mailed until after the end of the THREE-MONTH shortened statutory period, then the
shortened statutory period will expire on the date the advisory action is mailed, and any
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Huyen X. Vo whose telephone number is 571-272-7631.
The examiner can normally be reached on M-F, 9-5:30.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Patrick Edouard, can be reached on 571-272-7603. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.